feat(fuzz): mutation fuzzer with schema-conformance oracle #9
A small mutation-based fuzzer that runs the validator against AST-mutated
copies of real-world OpenAPI specs and checks two oracles:
1. No crashes (validate_request must not throw a Lua error).
2. Schema conformance — a request generated to satisfy an operation's
schema must be accepted by the validator.
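A minimal sketch of how the two oracles could be applied to a single validator result. The result keys (`lua_error`, `accepted`) are hypothetical names chosen for illustration; the harness's real JSONL fields live in fuzz/mutate_fuzz.py and may differ.

```python
from typing import Optional


def classify(result: dict) -> Optional[str]:
    """Return a finding label for one validator result, or None if it is clean.

    The keys "lua_error" and "accepted" are assumptions for this sketch,
    not the harness's actual output format.
    """
    # Oracle 1: validate_request must not throw a Lua error.
    if result.get("lua_error") is not None:
        return "crash"
    # Oracle 2: a request generated to satisfy the schema must be accepted.
    if not result.get("accepted", False):
        return "false_negative"
    return None
```

The crash check runs first because a run that threw says nothing about conformance.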
This is the productionised form of the harness used during v1.0.3 QA.
Locally it reproduces the path-extension Bug 1 against pre-fix v1.0.3
and the utf8_len(table) Bug 3 against unpatched jsonschema, and is
clean against the current main + jsonschema main.
Wired into CI:
- fuzz.yml — runs on every PR / push to main, 120s budget.
Fails the job on any crash or candidate
false-negative; uploads fuzz/out/ as an artifact.
- fuzz-nightly.yml — runs daily at 18:00 UTC, 600s budget. On failure
uploads findings, then opens (or comments on) a
fuzz-nightly tracking issue assigned to @jarvis9443.
See fuzz/README.md for architecture, mutator list, and how to extend.
📝 Walkthrough
Adds a mutation-based OpenAPI fuzzer: Python harness, seed spec, README, Makefile integration, .gitignore updates, and two GitHub Actions workflows (PR/main and nightly) that install OpenResty/LuaRocks, run the fuzzer with configurable budget, upload artifacts, and create/comment issues on failures.
Sequence Diagram(s)
sequenceDiagram
participant PythonFuzzer as Python Fuzzer
participant FS as File System
participant Resty as Resty CLI
participant Lua as Lua Validator
PythonFuzzer->>FS: Load seed OpenAPI spec
PythonFuzzer->>PythonFuzzer: Apply AST mutations & generate request cases
loop per batch/request
PythonFuzzer->>Resty: Invoke resty CLI (spec + JSONL cases via stdin)
Resty->>Lua: Compile spec and call v.validate_request inside pcall
Lua-->>Resty: Emit validation result or Lua error (JSONL)
Resty-->>PythonFuzzer: Stream JSONL results
end
PythonFuzzer->>FS: Write fuzz/out/crashes.jsonl and fuzz/out/summary.json
PythonFuzzer->>PythonFuzzer: Exit non-zero if crashes or non-noisy false-negatives found
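The last two steps of the diagram (write the summary, then pick an exit code) can be sketched as follows. This is an illustration under assumptions, not the harness's code: `one_round` is a hypothetical callable standing in for one mutate/generate/validate round, and the summary keys are invented for the example.

```python
import json
import time
from pathlib import Path


def run_with_summary(out_dir: Path, budget: float, one_round) -> int:
    """Run rounds until the time budget expires and always write summary.json.

    `one_round` is a hypothetical callable returning a tuple
    (cases_run, crashes, false_negatives) for a single round.
    Returns the process exit code: 1 if anything was found, else 0.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    t0 = time.time()
    rounds = cases = crashes = false_negatives = 0
    try:
        while time.time() - t0 < budget:
            c, cr, fn = one_round()
            rounds += 1
            cases += c
            crashes += cr
            false_negatives += fn
    finally:
        # Runs even if one_round raised, so partial progress is recorded.
        summary = {
            "rounds": rounds,
            "cases_run": cases,
            "crashes": crashes,
            "false_negatives": false_negatives,
            "elapsed_s": round(time.time() - t0, 3),
        }
        (out_dir / "summary.json").write_text(json.dumps(summary, indent=2))
    return 1 if (crashes or false_negatives) else 0
```

Writing the summary in `finally` means an unexpected exception in the loop still leaves an artifact for CI to upload.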
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 5 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
Makefile (1)
10-10: ⚠️ Potential issue | 🟠 Major
Add `fuzz` to `.PHONY` declaration.
The `fuzz` target is not listed in `.PHONY`, but a `fuzz/` directory exists. Make will consider the target up-to-date based on the directory's existence, causing `make fuzz` to skip execution.
🐛 Proposed fix
-.PHONY: test test-unit test-conformance lint dev install clean help
+.PHONY: test test-unit test-conformance lint dev install clean help fuzz
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` at line 10, The .PHONY declaration is missing the fuzz target so Make may treat the existing fuzz/ directory as an up-to-date target; update the .PHONY line (the .PHONY declaration that currently lists test test-unit test-conformance lint dev install clean help) to also include fuzz so the fuzz target always runs regardless of the fuzz/ directory's existence.
🧹 Nitpick comments (4)
fuzz/mutate_fuzz.py (3)
247-248: `max_per_op` parameter is unused.
The `max_per_op` parameter is declared but never used in the function body. Either implement the limiting logic or remove the parameter.
♻️ Option 1: Remove unused parameter
-def gen_cases(spec: dict, rng: random.Random, max_per_op: int = 2) -> list[dict]:
+def gen_cases(spec: dict, rng: random.Random) -> list[dict]:
And update the call site at line 426:
-    cases = gen_cases(spec, rng, max_per_op=2)
+    cases = gen_cases(spec, rng)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@fuzz/mutate_fuzz.py` around lines 247 - 248, The gen_cases function declares max_per_op but never uses it; either implement per-operation limiting inside gen_cases by counting or slicing generated cases for each operation (use the function name gen_cases and the parameter max_per_op to locate where to apply the limit) so no more than max_per_op cases are returned per op, or remove the max_per_op parameter from gen_cases and update all call sites that pass that argument to call the new signature (search for gen_cases(...) calls to update). Ensure whichever path you take keeps the function signature and callers consistent.
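The other option, enforcing the limit rather than removing the parameter, might look like the sketch below. It assumes each generated case carries an `"op"` key identifying its operation; that key is hypothetical and only used here for grouping.

```python
import random


def limit_per_op(cases: list[dict], rng: random.Random, max_per_op: int = 2) -> list[dict]:
    """Keep at most max_per_op cases per operation.

    Assumes each case has a hypothetical "op" key (e.g. "GET /pets").
    """
    by_op: dict[str, list[dict]] = {}
    for case in cases:
        by_op.setdefault(case["op"], []).append(case)

    kept: list[dict] = []
    for op_cases in by_op.values():
        if len(op_cases) > max_per_op:
            # Sample instead of slicing so later cases also get a chance.
            op_cases = rng.sample(op_cases, max_per_op)
        kept.extend(op_cases)
    return kept
```

Sampling with the shared `rng` keeps the selection reproducible for a given seed.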
351-352: Use explicit `Optional` type hint.
PEP 484 prohibits implicit `Optional`. The `extra_includes` parameter should use explicit union syntax.
♻️ Proposed fix
-def run_validator(spec: dict, cases: list[dict], deps: str, lib: str,
-                  extra_includes: list[str] = None, timeout: float = 30.0):
+def run_validator(spec: dict, cases: list[dict], deps: str, lib: str,
+                  extra_includes: list[str] | None = None, timeout: float = 30.0):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@fuzz/mutate_fuzz.py` around lines 351 - 352, The type hint for run_validator's extra_includes is currently implicit None; change it to an explicit Optional type (e.g., Optional[List[str]]) and ensure typing.Optional (or from typing import Optional, List) is imported so the signature becomes run_validator(..., extra_includes: Optional[List[str]] = None, ...); update the function definition and add the necessary typing import if missing to satisfy PEP 484.
370-374: Catch specific exception instead of bare `Exception`.
Silently swallowing all exceptions masks unexpected errors. Since you're parsing JSON, catch `json.JSONDecodeError` specifically.
♻️ Proposed fix
         try:
             out.append(json.loads(line))
-        except Exception:
+        except json.JSONDecodeError:
             pass
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@fuzz/mutate_fuzz.py` around lines 370 - 374, Replace the bare except Exception around the json.loads(line) call with an except json.JSONDecodeError to avoid swallowing unrelated errors: locate the try/except that wraps json.loads(line) (which appends to out) and change the exception handler to only catch json.JSONDecodeError so malformed JSON lines are skipped while other exceptions propagate; ensure json is imported where mutate_fuzz.py uses json.loads if not already.
fuzz/README.md (1)
19-27: Add language specifier to fenced code block.
The architecture diagram code block lacks a language specifier. While it's ASCII art, adding `text` or `plaintext` satisfies linters and improves rendering consistency.
📝 Suggested fix
-```
+```text
 mutate_fuzz.py (Python orchestrator)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@fuzz/README.md` around lines 19 - 27, The fenced code block that shows the architecture diagram (the block starting with "mutate_fuzz.py (Python orchestrator)") lacks a language specifier; update the opening fence from ``` to ```text (or ```plaintext) so linters/renderers recognize it as plain text and the README's code block renders consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/fuzz-nightly.yml:
- Around line 40-42: The "Install LuaRocks" step currently pipes an external
script into sh (curl ... | sh), which creates a supply-chain risk; change the
step so it first downloads the script (the command invoked in the Install
LuaRocks step), pins a specific commit or tag for the script URL, verifies its
integrity (e.g., compare a checked-in or workflow-provided SHA256 or GPG
signature) and only then executes the verified file, or alternatively vendor a
known-good installer into the repo or use an official action/installer instead
of piping to shell; update the Install LuaRocks step to reference the pinned URL
and add explicit verification before execution.
In @.github/workflows/fuzz.yml:
- Around line 32-34: The GitHub Actions step that installs LuaRocks currently
pipes an external script to sh (the "Install LuaRocks" run step), which is a
supply-chain risk; replace that by either vendoring the apache/apisix
utils/linux-install-luarocks.sh script into this repo and invoking the local
copy, or pin the curl URL to a specific commit SHA (use
raw.githubusercontent.com/.../<commit_sha>/utils/linux-install-luarocks.sh)
before executing it; if you vendor the script, add a short comment in the
workflow referencing the original upstream URL and commit SHA and add a periodic
checklist entry to review upstream updates so the workflow uses a known-good,
auditable script rather than an unpinned remote one.
In `@fuzz/README.md`:
- Around line 85-87: The README's Nightly entry references the GitHub handle
"@jarvis-api7" which is inconsistent with the actual assignee "jarvis9443" used
in .github/workflows/fuzz-nightly.yml; update the README.md Nightly line (the
string "@jarvis-api7") to the correct handle "jarvis9443" (or alternatively
update the workflow to match the README) so both the Nightly description and the
fuzz-nightly.yml assignee are the same.
- Around line 12-15: The markdown link
"../../../qa/lua-resty-openapi-validator-v1.0.3.md" in the README is a broken
relative path; update that link to point to the correct location of the QA
document (use the actual repo-relative path or an absolute URL), or if the QA
doc doesn't exist or is external, remove the link or replace it with a valid URL
and adjust the surrounding text accordingly so the reference no longer 404s.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4558101f-da50-48d4-b4a2-190146535dc5
📒 Files selected for processing (8)
- .github/workflows/fuzz-nightly.yml
- .github/workflows/fuzz.yml
- Makefile
- fuzz/.gitignore
- fuzz/README.md
- fuzz/mutate_fuzz.py
- fuzz/seeds/discourse.json
- fuzz/seeds/notion.json
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 9 comments.
- Remove unused 'import copy' (Thread 6)
- Fix module docstring: remove unimplemented mutators (scalar↔array, $ref cycle), clarify only positive cases are generated (Thread 5, 12)
- Enforce max_per_op limit in gen_cases (Thread 7)
- Separate crash_count and false_negative_count in summary (Thread 8)
- Fix README: mutator contract returns bool not string (Thread 9)
- Fix README: broken QA doc link removed (Thread 3)
- Fix README: align LITERAL_EXTS with actual code (Thread 13)
- Fix README: @jarvis-api7 → @jarvis9443 (Thread 4, 10)
- Redact API key in discourse.json seed (Thread 11)
- Replace curl-pipe-sh LuaRocks install with pinned tarball (Thread 1, 2)
- Update summary.json format in README docs (Thread 8)
The previous 3.11.1 tarball from luarocks.org failed because LuaJIT hits the 65536 constants limit when parsing the large manifest file. Switch to the same approach used by apache/apisix CI: luarocks 3.12.0 from GitHub releases with proper OpenSSL path configuration.
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/fuzz-nightly.yml:
- Around line 50-53: The Install Lua dependencies step currently installs
floating versions of jsonschema and lua-resty-radixtree; change the commands to
pin exact versions (e.g. replace `sudo luarocks install jsonschema` and `sudo
luarocks install lua-resty-radixtree` with `sudo luarocks install jsonschema
<version>` and `sudo luarocks install lua-resty-radixtree <version>`) or use
`luarocks install --pin` to produce a lockfile and install from that to make the
nightly job reproducible; update the step that runs these commands (the "Install
Lua dependencies" run block) to include the chosen version strings or the --pin
workflow and commit the generated lockfile.
- Around line 22-23: Ensure FUZZ_BUDGET is validated as a safe numeric value and
quoted when passed to the shell: validate env.FUZZ_BUDGET in the run step (e.g.,
reject or default if it does not match ^[0-9]+$) and call the target with
quoting like make fuzz FUZZ_BUDGET="$FUZZ_BUDGET" so user-controlled
workflow_dispatch.inputs.budget cannot inject shell metacharacters; reference
the FUZZ_BUDGET variable and the make fuzz invocation for where to add the
numeric check and the quoted use.
In `@fuzz/mutate_fuzz.py`:
- Around line 426-432: The spec loaded into variable spec is mutated then passed
to gen_cases before $ref resolution, causing referenced schemas to be collapsed
to fallbacks; resolve all $ref references on spec (e.g., via the same OV/OpenAPI
compile/resolution used by run_validator/ov.compile or your existing resolver)
immediately after mutate(...) and before calling gen_cases(spec, ...), so
gen_cases operates on the fully-resolved spec; ensure the resolver mutates or
returns a resolved spec and use that resolved spec for gen_cases and later
run_validator.
- Around line 423-491: The loop can raise exceptions and currently skips writing
summary.json; wrap the main fuzz loop (the block using crashes_path.open and
iterating while time.time() - t0 < args.budget, which updates rounds, cases_run,
crashes, false_negatives) in a try/finally (or try/except/finally) so that in
the finally block you build the summary dict using the current rounds,
cases_run, t0, crashes, false_negatives and always call
summary_path.write_text(json.dumps(summary, indent=2)) and print it before
exiting; ensure variables referenced (rounds, cases_run, t0, crashes,
false_negatives, crashes_path) are in scope for the finally block and preserve
the same exit code logic (sys.exit(1 if (crashes or false_negatives) else 0)).
- Around line 209-242: sample_value() currently only honors enums inside
_sample_string(), causing integer/number/boolean values to ignore schema["enum"]
and produce invalid samples; fix by checking for schema.get("enum") near the top
of sample_value (after the nullable check and before the type-specific branches)
and, if present, return rng.choice(schema["enum"]) (handling nullability if enum
contains None or schema.get("nullable") is true) so all types respect enum
constraints; keep the existing _sample_string() behavior but remove its
special-case-only enum handling.
- Line 46: Remove "simple" from the ARRAY_STYLES list so m_param_style() no
longer selects an invalid style for query parameters; update the constant
ARRAY_STYLES to only include "form", "pipeDelimited", and "spaceDelimited" and
ensure any references in m_param_style() continue to use that constant for
random selection of query parameter styles.
In `@fuzz/README.md`:
- Around line 18-26: The fenced code block in the README describing the fuzz
orchestrator is unlabeled which triggers markdownlint MD040; update the fence by
adding a language tag (e.g., "text") to the opening triple-backticks around the
mutate_fuzz.py architecture block so the block is labeled (refer to the block
that mentions mutate_fuzz.py, fuzz/seeds/, RUNNER_LUA and the JSONL output) and
re-run lint to confirm the MD040 violation is resolved.
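For the `sample_value()` finding above, a minimal sketch of checking `enum` before the type-specific branches could look like this. It is a simplified stand-in for the real function: the type branches, default ranges, and the inverted-bounds guard are assumptions made so the example runs on its own.

```python
import random


def sample_value(schema: dict, rng: random.Random):
    """Sample a value for a simplified schema, honoring enum for every type."""
    # Check enum first so integer/number/boolean schemas respect it too.
    if schema.get("enum"):
        return rng.choice(schema["enum"])

    t = schema.get("type")
    if t == "integer":
        lo = int(schema.get("minimum", 0))
        hi = int(schema.get("maximum", lo + 100))
        if lo > hi:
            # Mutated specs can invert bounds; swap so randint stays valid.
            lo, hi = hi, lo
        return rng.randint(lo, hi)
    if t == "number":
        return rng.uniform(0.0, 100.0)
    if t == "boolean":
        return rng.choice([True, False])
    # Fallback for strings and anything this sketch does not model.
    return "x"
```

With the enum check hoisted, a string-only special case inside a helper such as `_sample_string()` becomes redundant.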
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d63b5207-39b3-45df-b968-59be4ac1a784
📒 Files selected for processing (5)
- .github/workflows/fuzz-nightly.yml
- .github/workflows/fuzz.yml
- fuzz/README.md
- fuzz/mutate_fuzz.py
- fuzz/seeds/discourse.json
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/fuzz.yml
Actionable comments posted: 1
♻️ Duplicate comments (3)
.github/workflows/fuzz-nightly.yml (3)
22-23: ⚠️ Potential issue | 🟠 Major
Validate and quote `FUZZ_BUDGET` before invoking `make`.
`workflow_dispatch.inputs.budget` is user-controlled text. Expanding it unquoted on Line 73 lets shell metacharacters change what runs on the runner. Since `fuzz/mutate_fuzz.py` already accepts a numeric budget, validate that format first and then quote the make assignment.
🛡️ Proposed hardening
       - name: Run mutation fuzzer
         id: fuzz
         continue-on-error: true
         run: |
           export PATH=$OPENRESTY_PREFIX/nginx/sbin:$OPENRESTY_PREFIX/bin:$PATH
-          make fuzz FUZZ_BUDGET=$FUZZ_BUDGET
+          if ! printf '%s\n' "$FUZZ_BUDGET" | grep -Eq '^[0-9]+([.][0-9]+)?$'; then
+            echo "FUZZ_BUDGET must be a numeric number of seconds" >&2
+            exit 2
+          fi
+          make fuzz "FUZZ_BUDGET=$FUZZ_BUDGET"
Also applies to: 71-73
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/fuzz-nightly.yml around lines 22 - 23, Validate that the user-provided workflow input for FUZZ_BUDGET is a positive integer and fail fast if it isn't, then ensure the value is quoted when passed to the make invocation; specifically, add a step that checks the workflow_dispatch input (the FUZZ_BUDGET value) matches a numeric regex (e.g. only digits), set a sanitized variable with that validated value, and update the make invocation to use the quoted variable (e.g., FUZZ_BUDGET="${{ env.SANITIZED_FUZZ_BUDGET }}" or similar) so shell metacharacters cannot be expanded.
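The same check could also be applied on the Python side as defense in depth. The sketch below mirrors the regex from the proposed hardening; the function name `parse_budget` is hypothetical and not part of the harness.

```python
import re

# Same shape as the grep pattern in the proposed workflow fix:
# digits, optionally followed by a decimal part.
_BUDGET_RE = re.compile(r"[0-9]+(\.[0-9]+)?")


def parse_budget(raw: str) -> float:
    """Parse a budget in seconds, rejecting anything that is not a plain number."""
    # fullmatch rejects trailing characters, including a trailing newline.
    if not _BUDGET_RE.fullmatch(raw):
        raise ValueError(f"FUZZ_BUDGET must be a number of seconds, got {raw!r}")
    return float(raw)
```

Rejecting the value outright, rather than stripping suspicious characters, keeps the failure visible in the job log.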
42-47: ⚠️ Potential issue | 🟠 Major
Verify the LuaRocks tarball before building it.
Pinning `LUAROCKS_VER` helps, but this still builds whatever the remote archive serves and then installs it with `sudo`. Please verify a pinned checksum or signature before `make build`.
🛡️ Proposed hardening
           LUAROCKS_VER=3.12.0
-          wget -q "https://github.com/luarocks/luarocks/archive/v${LUAROCKS_VER}.tar.gz"
-          tar xzf "v${LUAROCKS_VER}.tar.gz"
+          LUAROCKS_TARBALL="luarocks-${LUAROCKS_VER}.tar.gz"
+          LUAROCKS_SHA256="<expected-sha256>"
+          curl -fsSLo "$LUAROCKS_TARBALL" \
+            "https://github.com/luarocks/luarocks/archive/refs/tags/v${LUAROCKS_VER}.tar.gz"
+          echo "${LUAROCKS_SHA256}  ${LUAROCKS_TARBALL}" | sha256sum -c -
+          tar xzf "$LUAROCKS_TARBALL"
           cd "luarocks-${LUAROCKS_VER}"
           ./configure --with-lua=$OPENRESTY_PREFIX/luajit
           make build && sudo make install
-          cd .. && rm -rf "luarocks-${LUAROCKS_VER}" "v${LUAROCKS_VER}.tar.gz"
+          cd .. && rm -rf "luarocks-${LUAROCKS_VER}" "$LUAROCKS_TARBALL"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/fuzz-nightly.yml around lines 42 - 47, The workflow currently downloads and builds the LUAROCKS tarball without integrity verification; change the steps around LUAROCKS_VER so the job first fetches the tarball and its checksum or signature (use the repo's .sha256/.sha512 or .asc signature), verify the archive (e.g., sha256sum -c against the pinned checksum or gpg --verify against a trusted key) and only if verification succeeds proceed to tar xzf, cd "luarocks-${LUAROCKS_VER}", ./configure --with-lua=$OPENRESTY_PREFIX/luajit and make build && sudo make install; fail the job on checksum/signature mismatch so unverified archives are never built/installed.
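What `sha256sum -c` does in the hardening above can be illustrated in Python. This is a sketch for readers, not a proposed replacement for the workflow step; the function name `verify_sha256` is hypothetical and the expected digest would have to be pinned by the maintainers.

```python
import hashlib
import hmac
from pathlib import Path


def verify_sha256(path: Path, expected_hex: str, chunk_size: int = 1 << 20) -> None:
    """Raise ValueError unless the file at `path` has the expected SHA-256 digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

The archive should only be extracted and built after this call returns without raising.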
63-66: ⚠️ Potential issue | 🟠 Major
Pin the Lua rock versions used by nightly CI.
These installs float to whatever versions are latest that day, so a nightly failure can come from upstream drift rather than this repo. Please install exact known-good versions or switch this step to a lockfile-driven flow.
📌 Minimal fix
-          sudo luarocks install jsonschema
-          sudo luarocks install lua-resty-radixtree
+          sudo luarocks install jsonschema <known-good-version>
+          sudo luarocks install lua-resty-radixtree <known-good-version>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/fuzz-nightly.yml around lines 63 - 66, The Install Lua dependencies step currently installs floating versions of jsonschema and lua-resty-radixtree; change it to install exact pinned versions or switch to a lockfile-driven install. Update the commands that install jsonschema and lua-resty-radixtree to include precise version specifiers (pin known-good versions) or replace the step with a luarocks lockfile install flow (using a generated luarocks.lock and invoking luarocks to install from it). Ensure you modify the "Install Lua dependencies" step to reference the rock names jsonschema and lua-resty-radixtree with those pinned versions or the lockfile-based installation command.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/fuzz-nightly.yml:
- Around line 114-116: The variable existing is being set from the gh issue list
pipeline and ends up containing the string "null" when no issues are returned;
update the jq expression in the assignment to use the fallback operator so null
becomes empty (e.g., change the jq filter '.[0].number' to '.[0].number //
empty') so that existing is an empty string when no issue is found and the
subsequent if [ -n "$existing" ] check behaves correctly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f4dc1868-fdba-4c5c-87d0-ca26419ca97e
📒 Files selected for processing (2)
- .github/workflows/fuzz-nightly.yml
- .github/workflows/fuzz.yml
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/fuzz.yml
- Remove 'simple' from ARRAY_STYLES (invalid for query params per OAS 3.0)
- Handle enum/const generically in sample_value() before type branches
- Clamp/swap bounds in integer sampling to prevent ValueError
- Encode query arrays using style-appropriate delimiter
- Resolve $ref pointers before generating cases
- Wrap main fuzz loop in try/finally to always write summary.json
- Validate FUZZ_BUDGET as numeric in nightly workflow
- Fix gh issue list jq to use '// empty' for null handling
- Add 'text' language tag to README fenced code block
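As an illustration of style-appropriate query array encoding, the sketch below follows the OpenAPI 3.0 style table for non-exploded arrays (`form` uses a comma, `pipeDelimited` a pipe, `spaceDelimited` a percent-encoded space). The function name and signature are hypothetical, and treating `explode=True` as repeated parameters for every style is a simplifying assumption; the commit's actual implementation may differ.

```python
from urllib.parse import quote

# Delimiters for non-exploded query arrays, per the OpenAPI 3.0 style values.
_DELIMS = {"form": ",", "pipeDelimited": "|", "spaceDelimited": " "}


def encode_query_array(name: str, values: list, style: str = "form",
                       explode: bool = False) -> str:
    """Encode one array-valued query parameter as a query-string fragment."""
    key = quote(name, safe="")
    # Percent-encode each item fully so delimiters inside values stay unambiguous.
    items = [quote(str(v), safe="") for v in values]
    if explode:
        # Assumption: exploded arrays repeat the parameter (id=3&id=4).
        return "&".join(f"{key}={item}" for item in items)
    # Keep "," and "|" literal; a space delimiter becomes %20.
    delim = quote(_DELIMS[style], safe=",|")
    return f"{key}=" + delim.join(items)
```

For example, `encode_query_array("id", [3, 4, 5], "pipeDelimited")` yields `id=3|4|5`. An unknown style such as `simple` raises `KeyError`, which matches its removal from the query-parameter styles.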
Actionable comments posted: 3
♻️ Duplicate comments (1)
.github/workflows/fuzz-nightly.yml (1)
63-66: ⚠️ Potential issue | 🟠 Major
Pin LuaRocks dependency versions for deterministic nightly runs.
Line 65 and Line 66 still install floating latest packages, so nightly failures can come from upstream releases rather than repo changes.
#!/bin/bash
set -euo pipefail
# Verify whether nightly workflow uses floating LuaRocks installs.
rg -n '^\s*sudo luarocks install (jsonschema|lua-resty-radixtree)\s*$' .github/workflows/fuzz-nightly.yml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/fuzz-nightly.yml around lines 63 - 66, The workflow installs floating LuaRocks packages in the "Install Lua dependencies" step; replace the two commands that currently run "sudo luarocks install jsonschema" and "sudo luarocks install lua-resty-radixtree" with pinned installs (explicit version strings or exact rockspecs) or reference workflow variables (e.g., JSONSCHEMA_VERSION and RADIXTREE_VERSION) and use them like "sudo luarocks install jsonschema <version>" and "sudo luarocks install lua-resty-radixtree <version>" so nightly runs are deterministic; update the step commands and add corresponding variables/inputs for the chosen versions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@fuzz/mutate_fuzz.py`:
- Around line 304-305: The code builds path_params from only operation-level
parameters (path_params = {p["name"]: p for p in (op.get("parameters") or [])
...}) but ignores path-item parameters; update the parameter collection to merge
path-item and operation-level parameters so path_params includes parameters from
both sources (e.g., combine path_item.get("parameters") and op.get("parameters")
before filtering), deduplicate by name+in, and then use that merged path_params
wherever the generator uses path_params (the blocks around the existing
path_params assignment and the later logic at lines ~323-346) so “positive”
requests include required path-item params as well.
- Around line 189-212: The resolver resolve_refs (and its inner helper _resolve)
currently follows $ref links without tracking which ref targets have been
visited, causing RecursionError on cyclical refs; update _resolve to accept and
propagate a visited set (e.g., visited_refs) and, when encountering a "$ref"
string, compute a stable key for the target (the ref string or the resolved path
parts) and check visited_refs before descending—if already visited, return the
node as-is (or a shallow copy) to break the cycle; otherwise add the key to
visited_refs before recursing into the resolved target and remove it after
returning so other branches remain unaffected.
- Around line 416-421: The code currently swallows JSON parsing errors in the
loop over r.stdout (for line in r.stdout.splitlines(): ... except Exception:
pass), which can hide malformed validator output; update the loop to catch
json.JSONDecodeError specifically, record the offending line(s) (e.g., add to a
parse_errors list or append raw lines to out_errors), and include that condition
when deciding failure: change the final check that uses r.returncode and out (if
r.returncode != 0 and not out) to also fail when parse_errors is non-empty
(e.g., if r.returncode != 0 or not out or parse_errors), and log the
parse_errors with context so malformed lines are visible (reference
variables/functions: r, r.stdout, r.returncode, out, json.loads).
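The cycle-detection suggestion for `resolve_refs` can be sketched as below. This is an illustrative version written from the review text, not the file's actual code: it only handles local `#/...` pointers, does not index into arrays, and leaves cyclic or dangling `$ref` nodes in place.

```python
def resolve_refs(spec: dict) -> dict:
    """Return a copy of spec with local "#/..." $refs inlined.

    Refs that would re-enter a target already being expanded (cycles) and
    refs that cannot be looked up are left as plain {"$ref": ...} nodes.
    """

    def lookup(ref: str):
        node = spec
        for part in ref[2:].split("/"):
            # Undo JSON Pointer escaping (~1 is "/", ~0 is "~").
            part = part.replace("~1", "/").replace("~0", "~")
            node = node[part]
        return node

    def _resolve(node, visiting: frozenset):
        if isinstance(node, list):
            return [_resolve(item, visiting) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/"):
            if ref in visiting:
                return dict(node)  # cycle: keep the $ref node as-is
            try:
                target = lookup(ref)
            except (KeyError, TypeError):
                return dict(node)  # dangling ref, which mutations can produce
            # A new frozenset per branch, so sibling branches are unaffected.
            return _resolve(target, visiting | {ref})
        return {key: _resolve(value, visiting) for key, value in node.items()}

    return _resolve(spec, frozenset())
```

Passing an immutable set down the recursion has the same effect as adding the key before descending and removing it afterwards, without the bookkeeping.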
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7021f23a-df6a-48b9-803b-6d08bd84a07d
📒 Files selected for processing (4)
- .github/workflows/fuzz-nightly.yml
- .gitignore
- fuzz/README.md
- fuzz/mutate_fuzz.py
✅ Files skipped from review due to trivial changes (1)
- .gitignore
- Add cycle detection in $ref resolution to prevent recursion blowups
- Merge path-item and operation-level parameters for complete coverage
- Report malformed JSONL output instead of silently swallowing parse errors
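Cycle detection in `$ref` resolution can be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: the signature `resolve_refs(node, root, visited_refs)` and the helper `resolve_pointer` are hypothetical, and only local `#/...` pointers are handled. A reference that is already being expanded is kept as a shallow copy of the `$ref` node, and each key is removed from `visited_refs` after its branch returns so sibling branches are unaffected.

```python
def resolve_pointer(root, ref):
    """Follow a local JSON pointer such as '#/components/schemas/Node'."""
    node = root
    for part in ref[2:].split("/"):
        part = part.replace("~1", "/").replace("~0", "~")  # JSON Pointer unescaping
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def resolve_refs(node, root, visited_refs=None):
    """Return a new tree with local $refs inlined; cyclic refs are left as-is."""
    if visited_refs is None:
        visited_refs = set()
    if isinstance(node, list):
        return [resolve_refs(item, root, visited_refs) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str) and ref.startswith("#/"):
        if ref in visited_refs:
            return dict(node)  # cycle: keep the $ref node (shallow copy)
        visited_refs.add(ref)
        try:
            return resolve_refs(resolve_pointer(root, ref), root, visited_refs)
        finally:
            visited_refs.discard(ref)  # other branches may still expand this ref
    return {key: resolve_refs(value, root, visited_refs) for key, value in node.items()}


NODE_REF = "#/components/schemas/Node"
spec = {
    "components": {"schemas": {"Node": {
        "type": "object",
        "properties": {"next": {"$ref": NODE_REF}, "name": {"type": "string"}},
    }}},
    "entry": {"$ref": NODE_REF},
}
resolved = resolve_refs(spec, spec)
```

With this input, the self-referential `next` property is expanded at most once per path and then left as a `$ref` node, so the recursion terminates.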
♻️ Duplicate comments (2)
fuzz/mutate_fuzz.py (2)
491-495: ⚠️ Potential issue | 🟠 Major
Use the same resolved spec for both case generation and validation.
Line 492 generates "positive" requests from `resolved`, but Line 495 validates against `spec`. This can desync the oracle and introduce false-negative noise. Pass `resolved` to `run_validator` too.
Suggested fix
- results = run_validator(spec, cases, args.deps, args.lib, args.include)
+ results = run_validator(resolved, cases, args.deps, args.lib, args.include)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@fuzz/mutate_fuzz.py` around lines 491 - 495, The validation is using the original spec variable while generated cases come from the transformed resolved spec, which can cause mismatches; update the call to run_validator to pass resolved instead of spec so the oracle matches the generated positive cases—i.e., after calling resolve_refs(spec) and gen_cases(resolved,...), call run_validator(resolved, cases, args.deps, args.lib, args.include) (referencing resolve_refs, gen_cases, run_validator, spec, resolved).
429-437: ⚠️ Potential issue | 🟠 Major
Don't ignore malformed JSONL when some lines are valid.
Current logic only emits `subprocess_error` when `bad_lines > 0` and no valid JSON lines exist. Mixed output still passes silently, which can hide runner/protocol issues.
Suggested fix
- if bad_lines and not out:
+ if bad_lines:
  out.append({"phase": "subprocess_error", "rc": r.returncode, "stderr": f"malformed validator JSONL output ({bad_lines} lines)"})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@fuzz/mutate_fuzz.py` around lines 429 - 437, The current parsing loop in mutate_fuzz.py collects valid JSON lines into out but drops information when there are malformed lines if any valid lines exist; update the post-loop logic (the block handling r.stdout, out, bad_lines and the subsequent if r.returncode checks) so that whenever bad_lines > 0 you also append or merge an additional result entry documenting the malformed JSONL (e.g., {"phase":"subprocess_malformed_lines","bad_lines":bad_lines,"stderr":"malformed validator JSONL output"}) instead of only emitting when out is empty; keep existing valid parsed entries but ensure the malformed-lines entry is always added when bad_lines > 0 so mixed output is not silently ignored.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 321bf6eb-5bc6-4ee2-b34e-d038f43d7239
📒 Files selected for processing (1)
fuzz/mutate_fuzz.py
Pull request overview
Copilot reviewed 7 out of 9 changed files in this pull request and generated 5 comments.
- Update docstring: OpenAPI 3.x (not just 3.0)
- Clarify crashes.jsonl contains both crashes and false negatives
- Fix resolve_refs docstring (returns new tree, not in-place)
- Fix README: length_on_array targets array only
- Generate and log RNG seed for reproducibility
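Generating and logging the RNG seed might look like the sketch below. It is an assumption-laden illustration rather than the code in the PR: `make_rng` is a hypothetical helper, and how the fuzzer actually accepts or prints a seed is not shown in this conversation. The idea is to draw a fresh seed when none is given, log it, and use a dedicated `random.Random` instance so a failing run can be replayed.

```python
import random
import secrets


def make_rng(seed=None):
    """Return (seed, rng); draw a fresh 64-bit seed when none is supplied."""
    if seed is None:
        seed = secrets.randbits(64)
    return seed, random.Random(seed)


seed, rng = make_rng()
print(f"fuzz seed: {seed}")  # log the seed so the run can be reproduced

# Replaying with the logged seed yields the same sequence of choices.
first_run = [rng.randrange(1000) for _ in range(5)]
_, replay_rng = make_rng(seed)
second_run = [replay_rng.randrange(1000) for _ in range(5)]
```

Using a private `random.Random` instance instead of the module-level functions keeps the mutation sequence independent of any other code that touches the global generator.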
What
Adds a mutation-based fuzzer (`fuzz/mutate_fuzz.py`) that runs the validator against AST-mutated copies of real-world OpenAPI specs and checks two oracles:
1. No crashes: `validate_request` must not throw a Lua error (caught with `pcall`).
2. Schema conformance: a request generated to satisfy an operation's schema must be accepted by the validator.
Changes
- `fuzz/mutate_fuzz.py`: Python orchestrator with 6 mutation strategies, $ref resolution, schema-conforming request generator, and Lua subprocess harness
- `fuzz/seeds/`: Two real-world OpenAPI specs (Discourse, Notion) as mutation seeds
- `fuzz/README.md`: Architecture, usage, and extension docs
- `.github/workflows/fuzz.yml`: PR CI, 120s fuzz budget
- `.github/workflows/fuzz-nightly.yml`: Nightly, 600s budget, auto-opens tracking issues on failure
- `Makefile`: `make fuzz` target
Review feedback addressed
- […] `simple` from query param styles (invalid per OAS 3.0)
- […] `enum`/`const` generically in `sample_value()` before type-specific branches
- […] `ValueError` […] `pipeDelimited`, `spaceDelimited`)
- […] `$ref` pointers before generating test cases
- […] `try/finally` to always write `summary.json`
- […] `FUZZ_BUDGET` as numeric in nightly workflow
- […] `gh issue list` null handling with `// empty`
- […] `text` language tag to README fenced code block
Summary by CodeRabbit
New Features
Documentation
Chores